AITopics | label noise

Collaborating Authors

label noise

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Signal-to-Noise Ratio and Sample Size Govern Representational Alignment in Neural Networks

Umar, Ali Hussaini, Laio, Alessandro

arXiv.org Machine LearningMay-27-2026

Neural networks are known to develop latent representations that are $aligned$, namely structurally similar across networks trained with different architectures, training protocols, or training datasets. We study this phenomenon in a controlled setting, where we train an ensemble of networks on regression and classification tasks using training sets perturbed by independent realizations of a noise process. We show that the signal-to-noise ratio (SNR) and the training sample size influence the alignment in qualitatively similar ways in networks trained on real-world datasets and in an extremely simple $linear$ network with a single hidden layer, for which the alignment can be estimated analytically. Across linear and nonlinear networks, regression and classification tasks, and both synthetic and real-world data, we consistently observe that alignment varies monotonically with SNR but non-monotonically with training sample size. In particular, the alignment is minimized near the interpolation threshold, and a stronger alignment does not necessarily correspond to better generalization error. These findings reveal a non-trivial dependence of alignment on data quality and quantity, decoupled from generalization performance.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Machine Learning

2605.26973

Country: Europe > Italy (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Symmetrization of Loss Functions for Robust Training of Neural Networks in the Presence of Noisy Labels

Paquin, Alexandre Lemire, Chaib-Draa, Brahim, Giguère, Philippe

arXiv.org Machine LearningMay-21-2026

Labeling a training set is often expensive and susceptible to errors, making the design of robust loss functions for label noise an important problem. The symmetry condition provides theoretical guarantees for robustness to such noise. In this work, we study a symmetrization method arising from the unique decomposition of any multi-class loss function into a symmetric component and a class-insensitive term. In particular, symmetrizing the cross-entropy loss leads to a linear multi-class extension of the unhinged loss. Unlike in the binary case, the multi-class version must have specific coefficients in order to satisfy the symmetry condition. Under suitable assumptions, we show that this multi-class unhinged loss is the unique convex multi-class symmetric loss. We also show that it has a fundamental local role: the linear approximation of any symmetric loss around score vectors with equal components is equivalent to the multi-class unhinged loss. We then introduce SGCE and alpha-MAE, two loss functions that interpolate between the multi-class unhinged loss and the Mean Absolute Error while allowing control of the beta-smoothness of the loss. Experiments on standard noisy-label benchmarks show competitive performance compared with existing robust loss functions.

artificial intelligence, loss function, machine learning, (18 more...)

arXiv.org Machine Learning

2605.20347

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Unveiling Memorization-Generalization Coexistence: A Case Study on Arithmetic Tasks with Label Noise

Liu, Linyu, Lu, Pinyan

arXiv.org Machine LearningMay-19-2026

Highly over-parameterized models can simultaneously memorize noisy labels and generalize well, yet how these behaviors coexist remains poorly understood. In this work, we investigate the underlying mechanisms of this coexistence using modular arithmetic tasks under heavy label noise. Through extensive experiments on two-layer neural networks, we find that larger models tend to generalize better under appropriate optimization and model configurations, while noisy labels are memorized faster than clean data. Over-parameterized models internally form a generalization structure, but its expression in the output is suppressed by the need to fit noisy labels. Remarkably, even with 80\% label noise, near-perfect test accuracy can be achieved by extracting this internal structure using frequency-based methods. We further propose a task-agnostic method to partition networks into generalization and memorization components. Although this subnetwork improves generalization, it is limited compared with frequency-based extraction, indicating that the generalization structure is distributed across neurons and motivating the development of new tools to retrieve generalizable knowledge from over-parameterized networks.

artificial intelligence, generalization, machine learning, (16 more...)

arXiv.org Machine Learning

2605.18022

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.64)

Add feedback

fc20ea8d104cab737a5561096f9bde9b-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-1-2026, 05:15:41 GMT

AQuAcurrently includes a variety of datasets for different classification problems, varying in the number of classes, sources of annotations, and data modalities. All datasets except those marked with are multi-class.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

fc20ea8d104cab737a5561096f9bde9b-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMay-1-2026, 05:15:38 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia (0.93)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

Add feedback

5fc47800ee5b30b8777fdd30abcaaf3b-Supplemental-Conference.pdf

Neural Information Processing SystemsMay-1-2026, 03:37:50 GMT

Having defined and validated the pairwise feedback simulator and evaluations in AlpacaFarm, we569 now turn our attention to studying methods that learn from pairwise feedback on AlpacaFarm.570 Unfortunately, the lack of existing benchmarks for learning from pairwise feedback for instruction571 following means that there has not been any open study of these methods in the instruction-following572 setting. In the remainder of this section, we will introduce our reference methods, which fall into two575 categories based on whether they fit a surrogate reward model as part of the learning process.576 FeedME is a method proposed by OpenAI [45] that incorporates human feedback578 with supervised fine-tuning on model generations that are rated 7/7 by human labelers. We adapt579 this approach to the pairwise feedback setting and call this baseline binary FeedME. This approach580 fine-tunes the SFT model on the chosen response in each preference pair with supervised learning.581 Motivated by controllable generation through conditioning [27, 34,582 29, 21], we propose binary reward conditioning, a baseline method that fine-tunes the SFT model583 with the feedback data Dpairwise by conditioning instances with either a positive or negative control584 token. Specifically, for each instance (x,y0,y1,z) 2D pairwise, the string concatenation of instruction585 x and response yz denoted as [x,yz] is prepended with the positive token and used in supervised586 fine-tuning (similarly [x,y1 z]is prepended with the negative token). This process creates a modified587 demonstration dataset that is double the size of Dpairwise. At test time, we draw samples from the588 fine-tuned model conditioned on the positive token.589 A.2 Methods that optimize a surrogate reward function590 We now describe methods that incorporate feedback by first building a surrogate reward model with591 pairwise feedback data. To start, we describe the step of training the surrogate reward model.592 While this can be a powerful approach,596 we will see that it can also lead to over-optimization [19] where models learn to exploit the reward597 model rather than achieve high true reward. We now describe 4 methods that leverage the surrogate598 reward model.599

annotator, large language model, machine learning, (22 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.52)

Add feedback

ebb6bee50913ba7e1efeb91a1d47a002-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 04:54:13 GMT

artificial intelligence, machine learning, obj, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Enhancing Sharpness-Aware Optimization Through Variance Suppression

Neural Information Processing SystemsApr-30-2026, 01:21:06 GMT

Sharpness-aware minimization (SAM) has well documented merits in enhancing generalization of deep neural networks, even without sizable data augmentation. Embracing the geometry of the loss function, where neighborhoods of'flat minima' heighten generalization ability, SAM seeks'flat valleys' by minimizing the maximum loss caused by an adversary perturbing parameters within the neighborhood. Although critical to account for sharpness of the loss function, such an'over-friendly adversary' can curtail the outmost level of generalization. The novel approach of this contribution fosters stabilization of adversaries through variance suppression (VaSSO) to avoid such friendliness.

machine learning, natural language, vasso, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology: